fusion operation
Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models
Zhang, Kexin, Lyu, Fuyuan, Tang, Xing, Liu, Dugang, Ma, Chen, Ding, Kaize, He, Xiuqiang, Liu, Xue
The evolution of previous Click-Through Rate (CTR) models has mainly been driven by proposing complex components, whether shallow or deep, that are adept at modeling feature interactions. However, there has been less focus on improving fusion design. Instead, two naive solutions, stacked and parallel fusion, are commonly used. Both solutions rely on pre-determined fusion connections and fixed fusion operations. It has been repetitively observed that changes in fusion design may result in different performances, highlighting the critical role that fusion plays in CTR models. While there have been attempts to refine these basic fusion strategies, these efforts have often been constrained to specific settings or dependent on specific components. Neural architecture search has also been introduced to partially deal with fusion design, but it comes with limitations. The complexity of the search space can lead to inefficient and ineffective results. To bridge this gap, we introduce OptFusion, a method that automates the learning of fusion, encompassing both the connection learning and the operation selection. We have proposed a one-shot learning algorithm tackling these tasks concurrently. Our experiments are conducted over three large-scale datasets. Extensive experiments prove both the effectiveness and efficiency of OptFusion in improving CTR model performance. Our code implementation is available here\url{https://github.com/kexin-kxzhang/OptFusion}.
Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost
Gao, Yuan, Zhang, Weizhong, Luo, Wenhan, Ma, Lin, Yu, Jin-Gang, Xia, Gui-Song, Ma, Jiayi
We aim at exploiting additional auxiliary labels from an independent (auxiliary) task to boost the primary task performance which we focus on, while preserving a single task inference cost of the primary task. While most existing auxiliary learning methods are optimization-based relying on loss weights/gradients manipulation, our method is architecture-based with a flexible asymmetric structure for the primary and auxiliary tasks, which produces different networks for training and inference. Specifically, starting from two single task networks/branches (each representing a task), we propose a novel method with evolving networks where only primary-to-auxiliary links exist as the cross-task connections after convergence. These connections can be removed during the primary task inference, resulting in a single-task inference cost. We achieve this by formulating a Neural Architecture Search (NAS) problem, where we initialize bi-directional connections in the search space and guide the NAS optimization converging to an architecture with only the single-side primary-to-auxiliary connections. Moreover, our method can be incorporated with optimization-based auxiliary learning approaches. In this paper, we tackle the practical issue of auxiliary learning, which involves improving the performance of a specific task (i.e., the primary task) while incorporating additional auxiliary labels from different tasks (i.e., the auxiliary tasks). We aim to efficiently leverage these auxiliary labels to enhance the primary task's performance while maintaining a comparable computational and parameter cost to a single-task network when evaluating the primary task.
Multi-Source Fusion Operations in Subjective Logic
van der Heijden, Rens Wouter, Kopp, Henning, Kargl, Frank
The purpose of multi-source fusion is to combine information from more than two evidence sources, or subjective opinions from multiple actors. For subjective logic, a number of different fusion operators have been proposed, each matching a fusion scenario with different assumptions. However, not all of these operators are associative, and therefore multi-source fusion is not well-defined for these settings. In this paper, we address this challenge, and define multi-source fusion for weighted belief fusion (WBF) and consensus & compromise fusion (CCF). For WBF, we show the definition to be equivalent to the intuitive formulation under the bijective mapping between subjective logic and Dirichlet evidence PDFs. For CCF, since there is no independent generalization, we show that the resulting multi-source fusion produces valid opinions, and explain why our generalization is sound. For completeness, we also provide corrections to previous results for averaging and cumulative belief fusion (ABF and CBF), as well as belief constraint fusion (BCF), which is an extension of Dempster's rule. With our generalizations of fusion operators, fusing information from multiple sources is now well-defined for all different fusion types defined in subjective logic. This enables wider applicability of subjective logic in applications where multiple actors interact.
Dual Recurrent Attention Units for Visual Question Answering
We propose an architecture for VQA which utilizes recurrent layers to generate visual and textual attention. The memory characteristic of the proposed recurrent attention units offers a rich joint embedding of visual and textual features and enables the model to reason relations between several parts of the image and question. Our single model outperforms the first place winner on the VQA 1.0 dataset, performs within margin to the current state-of-the-art ensemble model. We also experiment with replacing attention mechanisms in other state-of-the-art models with our implementation and show increased accuracy. In both cases, our recurrent attention mechanism improves performance in tasks requiring sequential or relational reasoning on the VQA dataset.